Constructing a spoken dialogue corpus for studying paralinguistic information in expressive conversation and analyzing its statistical/acoustic characteristics

Authors

  • Hiroki Mori
  • Tomoyuki Satake
  • Makoto Nakamura
  • Hideki Kasuya
Abstract

The Utsunomiya University (UU) Spoken Dialogue Database for Paralinguistic Information Studies is introduced. The UU Database is especially intended for use in understanding the usage, structure and effect of paralinguistic information in expressive Japanese conversational speech. Paralinguistic information refers to meaningful information, such as emotion or attitude, delivered along with linguistic messages. The UU Database comes with labels of perceived emotional states for all utterances. The emotional states were annotated with six abstract dimensions: pleasant-unpleasant, aroused-sleepy, dominant-submissive, credible-doubtful, interested-indifferent, and positive-negative. To stimulate expressively rich and vivid conversation, the “4-frame cartoon sorting task” was devised. In this task, four cards each containing one frame extracted from a cartoon are shuffled, and each participant, holding two of the four cards, then has to estimate the original order. The effectiveness of the method was supported by a broad distribution of subjective emotional state ratings. Preliminary annotation experiments by a large number of annotators confirmed that most annotators could provide fairly consistent ratings for a repeated identical stimulus, and the inter-rater agreement was good (W ≃ 0.5) for three of the six dimensions. Based on the results, three annotators were selected for labeling all 4840 utterances. The high degree of agreement was verified using such measures as Kendall’s W. The results of correlation analyses showed that not only prosodic parameters such as intensity and f0 but also a voice quality parameter were related to the dimensions. A multiple correlation above 0.7 and an RMS error of about 0.6 were obtained for the recognition of some dimensions using linear combinations of the speech parameters. Overall, the perceived emotional states of speakers can be accurately estimated from the speech parameters in most cases.
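
The abstract reports inter-rater agreement as Kendall's W (coefficient of concordance). As a rough illustration of how that measure is computed, here is a minimal Python sketch. It assumes each rater scores the same set of utterances on an ordinal scale and applies the standard tie correction, since coarse rating scales produce many tied scores. The function names and the toy ratings are hypothetical and are not taken from the paper or the UU Database.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W; ratings[r][i] is rater r's score for item i."""
    m = len(ratings)       # number of raters
    n = len(ratings[0])    # number of rated items (utterances)
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    # Tie term: sum of (t^3 - t) over every group of t tied scores per rater.
    ties = sum(t ** 3 - t for row in ratings for t in Counter(row).values())
    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every rater gives constant scores")
    return 12 * s / denom


if __name__ == "__main__":
    # Made-up ratings: 3 raters x 6 utterances on a 7-point scale.
    toy = [
        [1, 3, 4, 6, 7, 2],
        [2, 3, 5, 5, 7, 1],
        [1, 4, 4, 7, 6, 3],
    ]
    print(f"Kendall's W = {kendalls_w(toy):.3f}")
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings), so the value of about 0.5 quoted above indicates moderate agreement.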


Similar resources

A Study of the Relationship between Acoustic Features of “bæle” and the Paralinguistic Information

Language users benefit from special phonetic tools in order to communicate linguistic information as well as different emotional aspects and paralinguistic information through daily conversation. Having functions in conveying semantic information to listeners, prosodic features form the essential part of linguistic behaviour; manipulating them can potentially play an important role in transmitt...

Paralinguistic effects on turn-taking behavior in expressive conversation

Speaker and paralinguistic properties of dialogue speech that affect the timing at turn changes are investigated by analyzing the UU Database for Paralinguistic Information Studies. The results showed a large variation among speakers and a strong interaction with partner in pause/overlap duration. In addition, perceived emotional states of utterances had significant effects on the pause/overlap...

CASIA-CASSIL: a Chinese Telephone Conversation Corpus in Real Scenarios with Multi-leveled Annotation

CASIA-CASSIL is a large-scale corpus base of Chinese human-human naturally-occurring telephone conversations in restricted domains. The first edition consists of 792 90-second conversations belonging to the tourism domain, which are selected from 7,639 spontaneous telephone recordings in real scenarios. The corpus is now being annotated with a wide range of linguistic and paralinguistic information i...

Recognition of Paralinguistic Information using Prosodic Features Related to Intonation and Voice Quality

Besides the linguistic (verbal) information conveyed by speech, the paralinguistic (nonverbal) information, such as intentions, attitudes and emotions expressed by the speaker, also conveys important meanings in communication. Therefore, to realize a smooth communication between humans and spoken dial...

Morphology of vocal affect bursts: exploring expressive interjections in Japanese conversation

Expressive interjection (EI) is defined as non-lexical speech sound which indicates the speaker’s cognitive/affective state changes. It is a type of vocal affect burst, i.e., brief and sudden nonverbal expressions that are produced spontaneously and unconsciously. Although EI as a social signal is assumed to play an important part in speech communication, very little is known about its linguist...

Journal title:
  • Speech Communication

Volume 53

Publication date: 2011